Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Shelved] Add flushed_roots to dirty_store #4304

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

HaoranYi
Copy link

@HaoranYi HaoranYi commented Jan 6, 2025

Problem

When we flush slots from accounts cache, we should add them to "dirty_store" for clean to pick them up.

Summary of Changes

  • Add flushed slots to dirty store.

Fixes #

@@ -6295,33 +6295,42 @@ impl AccountsDb {
});

// Always flush up to `requested_flush_root`, which is necessary for things like snapshotting.
let cached_roots: BTreeSet<Slot> = self.accounts_cache.clear_roots(requested_flush_root);
let flushed_roots: BTreeSet<Slot> = self.accounts_cache.clear_roots(requested_flush_root);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename: cached_roots -> flushed_roots for clarity.

// Only add to the uncleaned roots set *after* we've flushed the previous roots,
// so that clean will actually be able to clean the slots.
let num_new_roots = cached_roots.len();
// Regardless of whether this slot was *just* flushed from the cache by
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An optimization to move the "max_flush_root" update out of the loop.
We only need to fetch_max with the max value in the "flushed_roots".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intuitively this looks right, but I wonder if there was a reason to not originally do it this way... Maybe worth looking at the git history to see?

Copy link
Author

@HaoranYi HaoranYi Jan 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code was introduced 4 years ago, and has never changed since them. @carllin

solana-labs@8f7c7fb#diff-1090394420d51617f3233275c2b65ed706b35b53b115fe65f82c682af8134a6fR3017

@carllin Is there any reason that we can't move set_max_flush_root out of the loop?

@HaoranYi HaoranYi changed the title add dirty_store when flush Add flushed_roots to dirty_store Jan 6, 2025
Copy link

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks correct to me.

// Only add to the uncleaned roots set *after* we've flushed the previous roots,
// so that clean will actually be able to clean the slots.
let num_new_roots = cached_roots.len();
// Regardless of whether this slot was *just* flushed from the cache by

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intuitively this looks right, but I wonder if there was a reason to not originally do it this way... Maybe worth looking at the git history to see?

// the cache is overwhelmed, we flushed some yet to be rooted frozen slots
// These slots may then *later* be marked as root, so we still need to handle updating the
// `max_flush_root` in the accounts cache.
if let Some(storage) = self.storage.get_slot_storage_entry(root) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.uncleaned_pubkeys.insert(slot, dirty_keys);

These slots has already been added to uncleaned_pubkeys when calculating accounts_delta_hash.

When clean runs, it will include all these dirty pubkeys in the candidates.

self.uncleaned_pubkeys.remove(&uncleaned_slot)

Therefore, I think we don't need to add them to dirty store at flush. It seems redundant and is a waste of time to scan those stores again.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the code to only added to dirty_stores when the slot is not found in uncleaned_pubkeys, and a stat to count how many time we added them.
I think it should be zero. I have started a node the verify this. We will know soon.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, mystery solved, thank you.

With the lt hash, we'll eventually remove the accounts delta hash calculation. At that point we'll need to fix this then.

Would it make sense to not add to uncleaned_pubkeys when calculating the accounts delta hash, and instead only add the storage to dirty_stores during flush?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we can do that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The experimental run on mainnet shows that the stats are indeed zeros. This means that we don't need to add them to dirty storage at flush time.

I will shelve this PR, and start two new other PRs for the obsolete comment change and the fetch_max optimization change.

@HaoranYi HaoranYi marked this pull request as draft January 7, 2025 20:43
@HaoranYi HaoranYi changed the title Add flushed_roots to dirty_store [Shelved] Add flushed_roots to dirty_store Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants